ABSTRACT

An adaptation of the standardization approach to 
assessing differential item functioning that applies to all item 
responses, including omits and not reached, is described. 
Applications of this method to evaluate differential speededness show 
that there is evidence of differential speededness for Blacks and 
Hispanics, but not for Asian Americans. Data from a study by A. P. 
Schmitt and C. A. Bleistein (1987) and from a recent form of the 
Scholastic Aptitude Test were used. There may be a dependency between 
differential speededness and test section location. Differential 
speededness may be more noticeable when the test section is located 
at the beginning of a test. Implications of these findings for 
evaluations of content-related differential item functioning and on 
differential test-taking strategies are described. (Contains five 
figures and eight references.) (Author/SLD) 
THE STANDARDIZATION APPROACH TO ASSESSING
DIFFERENTIAL SPEEDEDNESS 



The standardization approach to assessing differential item 
functioning (DIF), which is described in detail in Dorans (1987) and 
Dorans and Kulick (1983, 1986) for the analysis of the correct answer 
or keyed response, is readily adapted to all responses, including
omits and not reached (see footnote 1). Schmitt and Bleistein (1987), in their
analysis of the performance of Blacks on Scholastic Aptitude Test 
(SAT) analogy items, used the standardization method to examine 
DIF on distractors. In the process, they uncovered the phenomenon 
of differential speededness, i.e., differential response rates between 
focal group members and matched base group members to items 
appearing at the end of a section of a test. In the present paper, a 
description of the standardization approach to DIF as it generalizes to 
apply to all item options, including non-response, is presented. Then, 
data from the Schmitt and Bleistein (1987) study and a recent form 
of the SAT are used to illustrate how standardization uncovers 
phenomena such as differential speededness. 



1 When a candidate does not respond to an item, but responds to subsequent
items, the non-response to that item is referred to as an omit. If the candidate
does not respond to any of the subsequent items, then the first non-response
and the subsequent non-responses are characterized as "not reached".
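
The distinction drawn in footnote 1 can be sketched in a few lines of Python. This is only an illustration: the function name, the labels "omit" and "not_reached", and the use of None for a blank answer are assumptions introduced here, not part of the original analysis.

```python
from typing import List, Optional

OMIT = "omit"
NOT_REACHED = "not_reached"


def classify_responses(responses: List[Optional[str]]) -> List[str]:
    """Label each item of one section as an option letter, 'omit', or 'not_reached'.

    `responses` holds one examinee's answers in item order; None marks a blank.
    """
    # Position of the last item the examinee actually answered (-1 if none).
    last_answered = -1
    for i, r in enumerate(responses):
        if r is not None:
            last_answered = i

    labels = []
    for i, r in enumerate(responses):
        if r is not None:
            labels.append(r)             # a chosen option, kept as-is
        elif i < last_answered:
            labels.append(OMIT)          # blank followed by a later response
        else:
            labels.append(NOT_REACHED)   # blank with no later response
    return labels


# Hypothetical five-item section: item 2 is skipped, items 4 and 5 are never reached.
example = classify_responses(["A", None, "C", None, None])
```

Under this rule an examinee who answers nothing in the section has every item labeled as not reached, which follows from the footnote's definition.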
Standardization and the Keyed Response

In the traditional standardization analysis, an item is said to 
exhibit differential item functioning when the probability of 
correctly answering the item is lower or higher for examinees from 
one group than for equally able examinees from another group. The 
focus of DIF analyses is on differences in performance between 
groups that are matched with respect to the ability, knowledge or 
skill of interest. The basic elements of a standardization analysis of 
the keyed response are proportions correct at each level of a 
matching variable, such as total score, in a base or reference group 
and a focal or study group. Plots of these conditional proportions 
correct against score level in the focal and base groups provide a 
visual indication of the extent of DIF that an item exhibits. A plot of 
differences in conditional proportions correct between the focal and 
base group portrays the degree of DIF more directly. In addition to 
these plots, standardization provides numerical indices for 
quantifying DIF. 

The prime numerical DIF index that standardization computes is
the standardized p-difference, which is defined as

(1)  D_{STD} = \sum_s W_s (P_{fs} - P_{bs}) / \sum_s W_s,

where W_s / \sum_s W_s is the weighting factor at score level s used to
weight differences in the proportions correct between the focal group
(P_{fs}) and the base group (P_{bs}), and \sum_s is the summation operator
which sums these weighted differences across score levels to arrive
at DSTD, an index that can range from -1 to +1. Positive values of
DSTD indicate that the item favors the focal group, while negative
DSTD values indicate that the item disadvantages the focal group.
DSTD values between -.05 and +.05 are considered negligible.
DSTD values between -.10 and -.05 and between .05 and .10 are
inspected to ensure that no possible effect is overlooked. Items with
DSTD values outside the (-.10, +.10) range are more unusual and are
examined very carefully.
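
As a rough illustration of equation (1) and the screening bands just described, the following Python sketch computes DSTD from per-score-level weights and proportions. The function names and the example numbers are hypothetical, and the handling of values that fall exactly on .05 or .10 is an assumption, since the text does not say which band the boundaries belong to.

```python
from typing import Sequence


def standardized_p_difference(
    weights: Sequence[float],
    p_focal: Sequence[float],
    p_base: Sequence[float],
) -> float:
    """Weighted mean of (P_fs - P_bs) over score levels s, as in equation (1)."""
    if not (len(weights) == len(p_focal) == len(p_base)):
        raise ValueError("all inputs need one entry per score level")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(w * (pf - pb) for w, pf, pb in zip(weights, p_focal, p_base))
    return weighted / total


def screening_category(dstd: float) -> str:
    """Map a DSTD value onto the three screening bands described in the text.

    Assumption: exact boundary values (.05 and .10) go to the 'inspect' band.
    """
    size = abs(dstd)
    if size < 0.05:
        return "negligible"
    if size <= 0.10:
        return "inspect"
    return "examine carefully"


# Hypothetical item with three score levels.
d = standardized_p_difference(
    weights=[10, 30, 60],
    p_focal=[0.20, 0.50, 0.80],
    p_base=[0.30, 0.55, 0.80],
)
category = screening_category(d)
```

With these made-up inputs the index is about -.025, which falls in the negligible band.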

The weights, W_s / \sum_s W_s, are the essence of the
standardization approach. First, note that a common weight is
applied to both P_{fs} and P_{bs}. This contrasts with what occurs in the
computation of impact,

(2)  IMPACT = P_f - P_b
            = \sum_s N_{fs} P_{fs} / \sum_s N_{fs} - \sum_s N_{bs} P_{bs} / \sum_s N_{bs},

where N_{fs} and N_{bs} are the frequencies of score level s in the focal
and base groups, so that each group's conditional proportions are weighted
by that group's own score distribution. In addition, the particular set of
weights employed for standardization depends upon the purposes of the
investigation. Some plausible options are the following:

* W_s = N_{ts}, the number of examinees at s in the total group;

* W_s = N_{bs}, the number of examinees at s in the base group;

* W_s = N_{fs}, the number of examinees at s in the focal group;

or * W_s = the relative frequency at s in some reference group.




In practice, W_s = N_{fs} has been used because it gives the greatest
weight to differences in P_{fs} and P_{bs} at those score levels most
frequently attained by the focal group under study. Use of N_{fs} means
that DSTD equals the difference between P_f, the observed
performance of the focal group on the item, and
\hat{P}_f = \sum_s N_{fs} P_{bs} / \sum_s N_{fs}, the imputed
performance of selected base group members who are matched in
ability to the focal group members.
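
The contrast between impact and the standardized difference can be made concrete with a small Python sketch. The counts below are invented for illustration, and the function names are hypothetical. The sketch assumes every score level occupied by focal examinees also contains base examinees.

```python
from typing import Sequence


def impact(n_f: Sequence[int], c_f: Sequence[int],
           n_b: Sequence[int], c_b: Sequence[int]) -> float:
    """Equation (2): unmatched difference in overall proportions correct.

    n_* are examinee counts per score level, c_* are counts answering correctly.
    """
    return sum(c_f) / sum(n_f) - sum(c_b) / sum(n_b)


def imputed_base_performance(n_f: Sequence[int],
                             n_b: Sequence[int], c_b: Sequence[int]) -> float:
    """Base-group proportions correct reweighted to the focal score distribution."""
    total = sum(n_f)
    return sum(nf * cb / nb for nf, nb, cb in zip(n_f, n_b, c_b) if nf > 0) / total


def dstd_focal_weights(n_f: Sequence[int], c_f: Sequence[int],
                       n_b: Sequence[int], c_b: Sequence[int]) -> float:
    """Equation (1) with W_s = N_fs."""
    total = sum(n_f)
    weighted = sum(
        nf * (cf / nf - cb / nb)
        for nf, cf, nb, cb in zip(n_f, c_f, n_b, c_b)
        if nf > 0  # score levels with no focal examinees carry zero weight
    )
    return weighted / total


# Hypothetical counts at three score levels (low, middle, high).
n_f, c_f = [20, 50, 30], [4, 25, 24]      # focal group
n_b, c_b = [10, 40, 150], [3, 22, 126]    # base group

raw_gap = impact(n_f, c_f, n_b, c_b)
matched_gap = dstd_focal_weights(n_f, c_f, n_b, c_b)
observed_focal = sum(c_f) / sum(n_f)
imputed = imputed_base_performance(n_f, n_b, c_b)
```

In this made-up case the unmatched gap is about -.225 while the matched gap is about -.057, and the matched gap equals the observed focal proportion minus the imputed one. Most of the raw difference here comes from the two groups having different score distributions rather than from the item itself.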



Standardization and All Response Options 



The generalization of the standardization methodology to all response 
options including omission and not reached is straightforward. It is as 
simple as replacing the keyed response with the option of interest in all 
calculations. For example, a standardized response rate analysis on 
option A would entail computing the proportions choosing A (as 
opposed to the proportions correct) in both the focal and base groups, 

(3)  P_{fs}(A) = A_{fs} / N_{fs};   P_{bs}(A) = A_{bs} / N_{bs},

where A_{fs} and A_{bs} are the numbers of people in the focal and base
groups, respectively, at score level s who choose option A. The next
step is to compute differences between these proportions,

(4)  D_s(A) = P_{fs}(A) - P_{bs}(A).
Then these individual score level differences are summarized across
score levels by applying some standardized weighting function to these
differences to obtain DSTD(A),

(5)  D_{STD}(A) = \sum_s W_s [P_{fs}(A) - P_{bs}(A)] / \sum_s W_s,

the standardized difference in response rates to option A. In a similar
fashion one can compute standardized differences in response rates for
options B, C, D, and E, and for non-responses as well.

Differential Speededness

Application of the standardization methodology to counts of
examinees at each score who did not reach the item culminates in a
standardized not-reached difference,

(6)  D_{STD}(NR) = \sum_s W_s [P_{fs}(NR) - P_{bs}(NR)] / \sum_s W_s.

For items at the end of a separately-timed section of a test, these
standardized differences provide measurement of the differential
speededness of a test. Differential speededness refers to the existence
of differential response rates between focal group members and
matched base group members to items appearing at the end of a
section. Schmitt and Bleistein (1987) found evidence of this
phenomenon for Blacks, as compared to a matched group of Whites, on
analogy items. Schmitt and Dorans (1987) reported that this effect was
also found for Hispanics. In the balance of this paper, differential
speededness results for Black, Hispanic and Asian-American focal
groups, compared to a White base or reference group, are presented and
their implications are discussed.
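
The extension to every response category, including not reached, can be sketched as follows. This is an illustrative Python sketch rather than the procedure used in the studies cited: the record layout, the group labels, and the label "NR" are assumptions, focal-group counts are used as weights, and score levels at which either group has no examinees are simply dropped.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# One record per examinee for a single item:
# (group label, matching score, response label such as "A" or "NR").
Record = Tuple[str, int, str]


def standardized_response_differences(
    records: Iterable[Record], focal: str = "focal", base: str = "base"
) -> Dict[str, float]:
    """Return DSTD for every response label observed on the item."""
    n = defaultdict(int)   # (group, score) -> number of examinees
    k = defaultdict(int)   # (group, score, label) -> number giving that response
    labels = set()
    for group, score, label in records:
        n[(group, score)] += 1
        k[(group, score, label)] += 1
        labels.add(label)

    focal_scores = {s for (g, s) in n if g == focal}
    base_scores = {s for (g, s) in n if g == base}
    scores = sorted(focal_scores & base_scores)  # levels where matching is possible
    total_weight = sum(n[(focal, s)] for s in scores)
    if total_weight == 0:
        raise ValueError("no score level contains both focal and base examinees")

    result = {}
    for label in sorted(labels):
        weighted = 0.0
        for s in scores:
            nf, nb = n[(focal, s)], n[(base, s)]
            weighted += nf * (k[(focal, s, label)] / nf - k[(base, s, label)] / nb)
        result[label] = weighted / total_weight
    return result


# Hypothetical data for one end-of-section item at two score levels.
records = (
    [("focal", 10, r) for r in ["A", "A", "NR", "NR"]]
    + [("base", 10, r) for r in ["A", "A", "A", "NR"]]
    + [("focal", 20, r) for r in ["A", "B"]]
    + [("base", 20, r) for r in ["A", "A", "A", "B"]]
)
diffs = standardized_response_differences(records)
```

Because the response proportions within each group sum to one at every score level, the standardized differences across all categories sum to zero. A positive DSTD(NR) therefore has to be offset by negative values for one or more of the other categories.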

Figure 1 depicts standardized differential not-reached rates on the
last ten items of the two verbal sections of the November 1983 form of
the SAT that were observed for Blacks, Mexican-Americans, Puerto
Ricans and Asian-Americans. For the purposes of this paper,
cross-group comparisons were made cautiously because different
standardization weights were used for each ethnic group. It is evident
from Figure 1a that Blacks reach the last ten items of the 45-item
Verbal 1 section at a lower rate than a matched group of White
examinees. The standardized differential not-reached rate, DSTD(NR),
for Blacks hovers around .05 for all ten items. In contrast, the
DSTD(NR) values for Asian-Americans on these same ten items are
close to zero, indicating the absence of differential speededness. The
DSTD(NR) rates for the two Hispanic groups are closer to the Black rates
than the Asian-American rates, indicating that differential speededness
exists for Hispanics as well as Blacks, but not for Asian-Americans.



Insert Figure 1 about here 
Figure 1b depicts the standardized differential not-reached rates for
the last ten items of the 40-item Verbal 2 section. In contrast to Figure
1a, the differential speededness phenomenon builds up from no effect
for any group at item 31 to a clear separation of the groups by item 40.
Once again, the effect is most pronounced for the Blacks and
non-existent for the Asian-Americans. The effect is minimal for
Mexican-Americans, and it approaches the .05 level for Puerto Ricans on
the last few items.

Figure 2 depicts standardized differential not-reached rates on the
last ten items of the two verbal sections of the November 1984 form of
the SAT that were observed for Blacks, Mexican-Americans, and Puerto
Ricans. Figure 2a depicts the rates for the last ten items on the 45-item
Verbal 1 section, while Figure 2b displays the rates for the last ten
items on the 40-item Verbal 2 section. All three ethnic groups have
standardized differential not-reached rates near or above the .05 level
on the Verbal 1 section, with the Black group having the highest rates
and the Puerto Rican group having the lowest rates. This pattern is
similar to that seen for the November 1983 form with the exception
that the two Hispanic groups have exchanged locations on the plot. On
the Verbal 2 section in Figure 2b, differential speededness is noticeable
only for the Black group and, as in Figure 1b, builds up from near zero
on item 31 to over .05 on items 38 to 40.



Insert Figure 2 about here 
Figures 3, 4 and 5 depict standardized differential not-reached rates
on the last ten items of the two verbal sections of the November 1986
form of the SAT that were observed for Asian-Americans, Blacks, and
Hispanics (see footnote 2). Figure 3a portrays the DSTD(NR) rates for the
last ten Verbal 1 items when the Verbal 1 section was the first section in
the test, while Figure 3b displays the DSTD(NR) rates for the same ten
items when the Verbal 1 section appeared as the third section of the
test. As was the case for the November 1983 form, differential
speededness is non-existent for Asian-Americans. For Blacks and
Hispanics, the size of the differential speededness effect depends on the
location of the section in the test: Differential speededness is more
pronounced where Verbal 1 was the first section in the test, as seen in
Figure 3a.



Insert Figure 3 about here 



Figure 4a contains the rates for the last ten Verbal 2 items when the
Verbal 2 section was the fourth section in the test, while Figure 4b
contains the rates for the exact same items when the Verbal 2 section
appeared as the first section of the test. Once again, no evidence of
differential speededness for Asian-Americans exists. As was the case
on the November 1983 and November 1984 forms, differential
speededness on the Verbal 2 section for Blacks grows from near zero at
item 31 to near the .05 level by item 38. This buildup is most evident
in the second order, where the Verbal 2 section appeared as the first
section on the test. The Hispanic rates for the two different orders also
demonstrate the importance of section location. In Figure 4a,
differential speededness is virtually nonexistent for Hispanics; in Figure
4b, evidence of differential speededness exists for Hispanics.
Differential speededness is more pronounced where Verbal 2 was the
first section in the test. In contrast to what is observed for Blacks and
Hispanics, the absence of differential speededness for Asian-Americans
seems to generalize across section location.

2 Analyses for the November 1986 form combined all Hispanic examinees
into one category.



Insert Figure 4 about here 



Figure 5 depicts average differential speededness rates for the
Verbal 1 (Figure 5a) and Verbal 2 (Figure 5b) sections across the two
orders of test booklets. In these average plots, differential speededness
on the Verbal 1 section is less pronounced for Blacks than it was in the
two earlier November forms, which reflects the lower levels of differential
speededness observed when the items do not appear in the first section
of the test. (In both November 1983 and November 1984, Verbal 1 was
the first section.) One sees in Figure 5 that, once again, differential
speededness is non-existent for Asian-Americans.



Insert Figure 5 about here
Implications

There are both methodological and substantive implications of the 
differential speededness phenomenon. 

Methodological 

If undetected, the differential speededness phenomenon may 
produce evidence of differential item functioning that might be 
misconstrued to be content-related when it is actually a function of item 
position. In other words, differential speededness may induce 
differential item functioning on items at the end of a test section simply 
because those items are at the end of the section. Differential 
speededness can confound the assessment of content-related 
differential item functioning, as Schmitt and Bleistein (1987) discovered 
in their analysis of differential item functioning for Blacks on SAT 
analogy items. In an attempt to adjust for differential speededness 
effects, a slightly altered version of the standardized p-difference is 
computed in which examinees at each score level who do not reach the 
item are excluded from the analysis. Schmitt and Bleistein (1987) 
employed this correction and found that it reduced much of the 
differential item functioning that had been evident on the sets of ten 
analogy items that appear at the end of the 45-item Verbal 1 section. 
This finding led the authors to conclude that differential speededness 
was a major contributing factor to the appearance of differential item 
functioning for Blacks on SAT analogy items, especially those which 
appeared towards the end of a section. It is an empirical question 
whether the statistical adjustment for differential speededness yields a
DIF index that accurately reflects what would be seen under unspeeded 
conditions. Recent research (Dorans, Schmitt & Curley, 1988) suggests 
that the correction used by Schmitt and Bleistein mitigates the effect of 
speed but does not eradicate it. 
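
One possible reading of the adjustment described above is sketched below in Python. The text says only that examinees who do not reach the item are excluded at each score level; the choice to weight by the number of focal examinees who reached the item, the function name, and the example counts are all assumptions made for illustration.

```python
from typing import Sequence


def dstd_excluding_not_reached(
    n_f: Sequence[int], nr_f: Sequence[int], c_f: Sequence[int],
    n_b: Sequence[int], nr_b: Sequence[int], c_b: Sequence[int],
) -> float:
    """Standardized p-difference computed only on examinees who reached the item.

    n_*  : examinees per score level
    nr_* : examinees per score level who did not reach the item
    c_*  : examinees per score level who answered the item correctly
    """
    weighted = 0.0
    total_weight = 0.0
    for nf, rf, cf, nb, rb, cb in zip(n_f, nr_f, c_f, n_b, nr_b, c_b):
        reached_f = nf - rf
        reached_b = nb - rb
        if reached_f == 0 or reached_b == 0:
            continue  # no matched comparison is possible at this score level
        # Assumed weight: focal examinees who reached the item.
        weighted += reached_f * (cf / reached_f - cb / reached_b)
        total_weight += reached_f
    if total_weight == 0:
        raise ValueError("no score level has examinees reaching the item in both groups")
    return weighted / total_weight


# Hypothetical item: at the lower score level half of the focal group
# never reaches the item, while all base examinees do.
adjusted = dstd_excluding_not_reached(
    n_f=[10, 10], nr_f=[5, 0], c_f=[4, 8],
    n_b=[10, 10], nr_b=[0, 0], c_b=[8, 8],
)
```

In this constructed case the unadjusted index with focal weights would be -.20, yet focal examinees who reach the item answer it correctly at the same rate as matched base examinees, so the adjusted index is zero. Real data need not behave this cleanly, which is consistent with the finding that the correction mitigates the speed effect without eradicating it.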

The existence of differential speededness also has implications for 
the quality of matching that can be attained. Matching on a total score 
that is contaminated by differential speededness is likely to influence 
DSTD values in a small but systematic way. Bleistein and Schmitt 
(1987) found that as the unidimensionality of the matching variable 
increases, fewer items are flagged for DIF. In order to avoid the
artifactual detection of DIF, it may be necessary to devise ways of 
removing the speed component from the matching score. 

Substantive 

The existence of differential speededness also has important 
implications for the advice given to test-takers and for test 
specifications. It appears that the speededness of the SAT-Verbal 
sections differentially affects matched groups of Whites and Blacks, and 
Whites and Hispanics. Differential speededness may be a consequence 
of differential test-taking strategies employed by different groups. For 
example, Whites may be more likely to skip over difficult items when 
confronted with them than would a matched group of Blacks. As a 
consequence, the matched Whites may be more likely to reach items at 
the end of the test than are the Blacks. Indirect evidence for this
differential strategy hypothesis can be garnered from examination of
standardized differential omit rates, as was done by Rivera and Schmitt
(1986), who found that Hispanics tended to omit less than matched
Whites.

The degree of differential speededness observed may be partially 
dependent upon section location. More differential speededness was 
evident on both verbal sections in the November 1986 data when these 
sections were the first in the test booklet. One plausible explanation for 
this location effect may be that the Black and Hispanic examinees as a 
group have less experience taking tests than the matched group of 
Whites and consequently may be less adept at pacing themselves early 
in the test. As a reaction to running out of time on the first section,
they may quicken their pace through the later sections of the test,
consequently dampening the differential speededness effect.

Differential speededness is not a desirable test property. Its impact 
on test scores needs to be investigated. Differential speededness is 
bound to affect test scores when easy items are involved, as is the case 
for the first items among the last ten analogy items on the 45-item
Verbal 1 section of the SAT, because these easy items are likely to be 
answered correctly if they are reached. Test specifications need to be 
reexamined to ascertain what changes can be made to mitigate the 
impact of differential speededness. 


